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Abstract 

The most common state space reconstruction method in the analysis 
of chaotic time series is the Method of Delays (MOD). Many techniques 
have been suggested to estimate the parameters of MOD, i.e. the time 
delay r and the embedding dimension m. We discuss the applicability 
of these techniques with a critical view as to their validity, and point 
out the necessity of determining the overall time window length, r™, for 
successful embedding. Emphasis is put on the relation between and the 
dynamics of the underlying chaotic system, and we suggest to set > Tp, 
the mean orbital period; Tp is approximated from the oscillations of the 
time series. The procedure is assessed using the correlation dimension for 
both synthetic and real data. For clean synthetic data, values of larger 
than Tp always give good results given enough data and thus Tp can be 
considered as a lower limit (r^, > Tp). For noisy synthetic data and real 
data, an upper hmit is reached for which approaches Tp for increasing 
noise amplitude. 



1 Introduction 

State space reconstruction is the first step in non-linear time series analysis of 
data from chaotic systems including estimation of invariants and prediction. For 
a recent review of these topics see ^ and Q . Reconstruction consists of viewing 
a time series Xk — x{kTs), k — 1, . . . , iV in a Euchdean space K™, where m is 
the embedding dimension and Tg is the sampling time. Doing this, we hope that 
the points in E™ form an attractor that preserves the topological properties of 
the original unknown attractor. A standard way to reconstruct the state space 
is the Method of Delays (MOD). Using MOD, each m-dimensional embedding 
vector is formed as = [xk,Xk+p, ■ ■ ■ , Xk+(rn-i)p]'^ where p is a multiple integer 
of Tg so that the delay time r equals pr^. H. The m coordinates of each point 
Xfe are samples from the time series (separated by a fixed r) covering a time 
window of length r^, = (to — l)r (or r^, = (m — l)p as multiple of r^). 
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The fundamental theorem of reconstruction, introduced first by Takens [Q 
Q and extended more recently in |^ , gives no restriction on r while for m states 
the sufhcicnt (but not necessary) condition m > 2d+l, where d is the fractal 
dimension of the underlying attractor ^. Takens' theorem is valid for the case of 
infinitely many noise-free data. In practice, however, with a limited number of 
possibly noisy observations, the selection of r and m is rather important for the 
quality of the reconstruction. Many methods have been suggested for estimating 
these parameters, but they are all empirical in nature and do not - as we show 
- necessarily provide appropriate estimates. This is a rather typical situation 
regarding state space reconstruction in general. 

While there will always be uncertainties related to reconstruction from real 
data, it is still important to try to improve the procedures. We suggest r^, as 
an independent parameter instead of focusing on the interrelated parameters 
r and m of MOD. The time window length is of particular importance since 
it determines, in a certain sense, the amount of information passed from the 
time series to the embedding vectors. For a given r^, one may then select a 
sufficiently large m. Suggestions for the selection of have been made in 0, 
@i [01 ini and jl^ but to our knowledge there has been little systematic 
work regarding this parameter. We give procedures for estimating Tw from 
the signal. Only time series from continuous systems are treated. For discrete 
systems, one typically sets p — 1, reducing the number of parameters to one - 
the embedding dimension, since =171 — 1. 

The quality of the reconstructions is assessed using the correlation dimension 
13 1 . The resulting reconstructions may not be the most suitable for other 
purposes such as estimation of Lyapunov exponents and prediction. However, 
with improved reconstructions for dimension estimation it is likely that the 
technique will be valuable also in other cases. 

In section ^, we discuss several of the methods suggested up to now for 
estimating r and m in MOD and comment on the underlying ideas as well 
as on the validity of the results. In section we establish the role of r^, in 
reconstruction and give simple ways to estimate it. Finally, in section |4[ the 
correlation dimension is used to assess the proposed procedure using noise-free 
and noise-corrupted synthetic data as well as real data. 



2 Suggested methods for estimating the MOD- 
parameters 

A very helpful approach in visualizing the reconstruction problem is to consider 
the reconstruction as an orthogonal projection from some high p-dimensional 
state space onto an TO-diniensional subspace defined by the m coordinates of the 
reconstructed vectors. Defining the linear mapping B : Rp — > M™, from each 
p-dimensional vector x^. to an m-dimensional vector x™, we have x™ = Bx^, 
where the rows of the mxp matrix B are orthonornial. The p coordinates of 
x^ are actually all the samples in the time window and in the case of MOD, 
where p — 1 = = (m — l)p, the m coordinates of the projected subspace are 

^Similar work was made independently in 

^Actually, Takens' condition uses [d] instead of d, the topological dimensipji, i.e. the lower 
integer greater than d. The use of d in the inequality has been established in fcl allowing lower 
values for m. 
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every p'th sample starting with the first, i.e. each row of B has one 1 and p—1 
zeros. Obviously, one can find other m-dimensional subspaces using a smaller 
p (which may not cover the whole t^)- Using p—1 results in an unfavorable 
reconstruction if the time series is densely sampled because then the attractor 
lies on the diagonal in K™. (The successive samples differ very little from 
each other.) In such a projection we utilize only the m first samples of Tw 
Other projections may be considered such as the one employed in the Singular 
Spectrum Approach (SSA) 0]. This method yields first a transformation of 
the natural coordinate system to another orthogonal system, ranking the p 
new directions according to the variance they explain, followed by a projection 
onto the m first directions. The rows of the B matrix are then the first m 
eigenvectors of the pxp sample covariance matrix of the embedding vectors. 
The reconstruction viewed as a projection from the hyperspace determined by 
T„ reveals the importance of this parameter. For MOD, the subspace is defined 
completely by the parameters t (or p) and m and for SSA by p and m. 

Certain statements supporting current methods for estimating r and m have 
been widely accepted and almost adopted as axioms. We do not intend to 
question all the existing methodology on MOD state space reconstruction, but 
feel that a discussion is needed regarding the guidelines used to choose the 
parameters. 

2.1 Comments on the selection of the delay time 

Consider first r and the two following widely accepted criteria: 

1. The reconstructed attractor must be expanded from the diagonal (imply- 
ing that r should not be too small) but not too much so that it folds back 
(implying that r should not be too large). 

2. The components of the vector must be uncorrelated. 

Note the similarity of the two criteria: increasing r expands the attractor 
from the diagonal and the components get less correlated; beyond some range of 
T, folding may occur and the components again get correlated. These goals are 
intuitively reasonable for m — 2, while the generalization to a larger m is not 
always straightforward as we show below. Many methods based on geometric 
properties seek the t that makes the attractor cover the largest region or expands 
it maximally from the diagonal ||l2|, However, the goal of stretching 
the attractor from the diagonal to get "good" reconstructions is based rather on 
empirical than theoretical grounds. In theory, a good reconstruction means near 
topological equivalence of the reconstructed attractor to the original one. One 
way to assess topological equivalence is to check whether stretching and folding 
are proportionally the same in the two attractors. In practice, this is done by 
checking whether the inter-distances of points remain proportionally the same 
in the two attractors or, alternatively, by checking whether nearby points on the 
original attractor remain relatively close on the reconstructed attractor. This 
last property is not always preserved when we expand the attractor from the 
diagonal, even for proper expansions according to the two above criteria. We 
show this for the Lorenz system jl^ in Fig.^ Fig.|l^ shows that when r is 
very small (r — 0.01) the reconstructed attractor lies almost on the diagonal 
and the points are generally getting closer than the corresponding points on the 
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Distance of points on the original attractor Distance of points on tfie original attractor Distance of points on tfie original attractor 



Figure 1: Correlation diagrams of the distances of points on the original attrac- 
tor (x-axis) and on the reconstructed attractor (y-axis) for the Lorenz system. 
Results are shown for 10% of the 20000 data points sampled with Tj = 0.01 
time units. For each point on the original attractor the distance from its near- 
est neighbor is computed and keeping track of the time indices the distance 
of the corresponding points on the reconstructed attractor is then found. The 
attractor is reconstructed with MOD, m — 3 and p = 1 in (a), p = 18 in (b), 
and p = 9 in (c). 



original attractor. One expects that this problem is resolved when we expand 
the attractor sufficiently (r = 0.18 which gives the minimum of the so-called 
mutual information - see below). But the opposite phenomenon is observed 
instead as shown in Fig.|l|b, i.e. points that are close on the original attractor 
become more distant on the reconstructed attractor. Further, we show in Fig.|l|c 
that the distances are more balanced for the reconstruction with a comparably 
small value of r (r = 0.09) which is not apparent from the two above criteria. 
The point we want to infer from this remark is that there is not necessarily 
a meaningful answer to the question: Why should we seek the r that gives 
sufficient expansion from the diagonal? Expansion per se does not guarantee a 
configuration of the reconstructed attractor closer to the original one. 

Concerning the second criterion, the estimates for r are based either on linear 
decorrelation, choosing r such that R{t) — 0, where R is the autocorrelation 
function]^, or general decorrelation choosing r to be the first minimum of the 
mutual information /(r) as developed in | p^ . These two methods guarantee 
decorrelation (linear or general) between two successive components Xk and Xk+r 
of the reconstructed vector x^. But even if Xk and Xk+r are uncorrelated and 
Xk+T and Xk+2T are uncorrelated, it does not follow that Xk and Xk+2T are also 
uncorrelated. As an example, we show in Fig.|^ R and / for a time series from 
the Taylor-Couette experiment in the chaotic regime which exhibits strong 
decorrelation for some lag r and strong correlation for lag 2r. We believe that 
the behavior of the correlation functions in Fig.|2|are often met in applications 
since chaotic time series from low dimensional systems frequently show pseudo- 
periodicities. 



^Other values of R(t) such as R{t) = 1/e have also been suggested but used little in 
applications, e.g. see [h7|. 
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Figure 2: Autocorrelation function i?(r) in (a) and mutual information /(r) in 
(b) for a time series of 10000 data measured from the Taylor- Couette experiment 
in the chaotic regime. Note the approximate matching of the zeros of R to 
minimums of / and the extremes of R to maximums of / indicating a dominant 
linear correlation. Moreover, note that the first decorrelation time is for p ~ 20 
while for p ~ 40 there is maximum correlation. 



One may be confronted also with other problems attempting to estimate r: 
the autocorrelation function may get approximately zero only after an extremely 
long time, as for the x- variable of the Lorenz system, or the mutual information 
may not have a clear minimum, as is the case with the physiological data used 
below. 



2.2 Comments on the selection of the embedding dimen- 
sion 

The standard way to find m is to use some criterion which the geometry of 
the attractor must meet and check for which embedding dimension m* this is 
fulfilled as the attractor is embedded in successively higher dimensional spaces. 
Then m* is the lowest embedding dimension to be used for reconstruction. 
Obviously, in estimating m, t is fixed when MOD is used. 

Among different geometrical criteria (including also the correlation dimen- 
sion), the most popular seems to be the method of "False Nearest Neighbors" 
(FNN) developed in |Q and enhanced recently i n The rationale behind 

this method has also been discussed in |Q and ||23|. This criterion concerns 
the fundamental condition of no self-intersections of the reconstructed attrac- 
tor. The original attractor lies on a smooth manifold of dimension \d\ . Self- 
intersections of the reconstructed attractor indicate that it does not lie on a 
smooth manifold and thus the reconstruction is not successful. The condition of 
no self-intersections states that if the attractor is to be reconstructed success- 
fully in R"*, then all neighbor points in M"* should also be neighbors in M™+^. 
The method checks the neighbors in successively higher embedding dimensions 
until it finds only a negligible number of false neighbors when increasing the 
dimension from m* to m* + 1. This to* is chosen as the lowest embedding di- 
mension that gives reconstructions without self-intersections. However, the fact 
that the distances between neighboring points do not change when measured 
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in R"* and in does not necessarily mean that these points are also true 

neighbors on the original attractor. 

Specifically, one has to consider the interdependence of m and r. The esti- 
mation of m depends on the selection of r (p) as we show in Fig.|^ for the Lorenz 
system. The proportion of false nearest neighbors does not fall to zero for the 
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Figure 3: Plot of the percent of false neighbors detected as the embedding 
dimension is increased for different values of r. The algorithm of FNN has been 
implemented for a time series of 10000 samples of the x variable of the Lorenz 
system. The different curves correspond to the time delays given in the legend as 
multiple of the sampling time = 0.01. The horizontal stippled line shows the 
1% level of false neighbors which is often used as the discriminative threshold 
value. 

same m as r increases but rather the estimated m increases slowly with t. Thus, 
the estimation of m is somewhat arbitrary unless the method finds the same m 
for a sufficiently large range of r values. For a very small r, there is a typical 
underestimation of m. Such a r forces the attractor to lie near the diagonal in 
R™. Increasing m by one has little effect on the geometry of the attractor as it 
will still lie near the diagonal of R™+^. All the points will apparently look as 
true neighbors leading to a wrong conclusion. 

The method is very sensitive to noise giving larger values of m for noisy data 
as pointed in Q and Q. In fact, the effect of noise is greater for larger values 
of T. This is a serious drawback of the method because in real applications we 
are led to choose a larger m than we really need. This problem is particularly 
relevant for MOD, where the projections are chosen without regard to noise 
filtering which is partly accomplished using SSA-reconstructions . 

Another method that has been suggested to estimate m is based on truncat- 
ing the singular spectrum of SSA (for details see and Q). In fact, the idea 
behind this linear approach is, given the hyperspace of dimension p, to find the 
smallest subspace (hyperplane) that approximately bounds the attractor. This 
subspace is spanned by the eigenvectors corresponding to the largest eigenvalues 
of the sample covariance matrix, i.e. the directions where the attractor has the 
largest variance. However, a strange attractor lies on a manifold which occupies 
all directions in the embedded space (very much like noise) and a clear cut-off 
is not expected [ p6[ . On the other hand, if this approach is implemented locally 
it can reveal the dimension of the tangent space to the manifold and the aver- 
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aging over a grid of local regions can give a robust estimate of m as shown in 
[ p7[ . However, this estimate depends on the choice of the dimension p of the 
hyperspace, i.e. the time window length t^,. 

From these remarks we conclude that many of the existing methods for 
estimating t and m are based on somewhat arbitrary criteria and do not always 
guarantee good reconstructions. The performance depends on the problem at 
hand. 



3 The time window length - Tu, 

When analysing a time series one typically begins with an initial reconstruction, 
and implements a non-linear method to this and other modified reconstructions 
until a stable result is attained. Here we concentrate on the time window length 
Tiu to determine the reconstruction. 

There is probably no uniquely best way to choose an initial r^,. We will 
argue that it may be reasonable to set equal to the "memory" of the system, 
i.e. the measurement record needed to determine future observations as reliably 
as possible. For practical reasons, one would like the shortest possible r„. 
Geometrically, one could associate such a Tu, with the mean orbital period Tp, 
i.e. the mean time between two consecutive visits to a local neighborhood. For 
low-dimensional chaotic systems showing pseudo-periodicity, the mean orbital 
period could naturally be associated with the mean time between visiting a 
Poincare section. 

For several chaotic systems, Tp carries significant information about the dy- 
namics. For systems that generate attractors with a sheet-like structure in R'^ 
(see for example [^), it can be shown that the Poincare section gives points 
that in a scatter plot lie approximately on a curve, which is the one dimensional 
manifold that embeds an attractor very much like the strange attractor of the 
logistic map. The same result may be obtained by selecting the points from the 
extremes or maxima of the time series directly instead of using reconstruction 
and Poincare section. This has been shown for the Lorenz system |^ and the 
Rossler system |3^. We found similar results studying the oscillations of other 
systems with sheet-like structure, such as the Rabinovich-Fabrikant system pl[ 
and the Mackey Glass system for A ~ 17 (for details of this system see 
below). 

As indicated above, the procedure suggested here requires only an initial 
estimate of which is subsequently adjusted. Given only a set of observations, 
a very simple solution is to select the initial r„ as the mean time between peaks 
(tbp) of the original time series. In general, tbp will be less than Tp, and thus 
it is natural to consider tbp a lower limit. For a low dimensional system, e.g. 
defined asymptotically in R'^, it is reasonable to assume that an orbital period 
corresponds to an oscillation when projected down to the observed axis, and 
thus Tp = tbp. For more complicated systems in higher dimensional spaces, a 
complete orbit may form more than one oscillation. In that case, Tp should be 
estimated as the average over a pattern of oscillations. 

The equation of Mackey Glass |^ 

0.2x(t - A) 

^ ' • 0.1x{t) (1) 



1 + [a;(i- A)]io 
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is a good example to show how one can find lower limits for from the oscil- 
lations of the time series. This time delay differential equation was discretized 
following the iterative scheme in , and segments of the time series for differ- 
ent A are shown in Fig.H with solid grey lines. 



(a) (b) (c) 
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Figure 4: The solid grey lines in all three figures are for segments of the Mackey 
Glass time series for different A and the stippled lines after smoothing with a 
k-FlR filter, (a) A = 17, = 1, and k = 10. (b) A = 30, r, = 1, and k ^ 30. 
(c) A = 100, T, = 1, and k = 80. 

For A — 17, the attractor is low dimensional (d ~ 2 ||l3|) and an orbital 
period can be assumed to correspond to a single oscillation (solid grey line in 
Fig.IJa). Then Tp can be easily estimated as tbp after filtering the time series to 
avoid close peaks that do not correspond to distinct oscillations (stippled black 
line in Fig.|^a), and thus for A = 17 we can conclude that > Tp — tbp ~ 50 
time units. 

For A = 30, the attractor has a higher dimension (d ~ 3 [^) and as Fig.^ 
shows, in many parts of the time series there are systematic variations over 
a pattern of oscillations (often comprised of a small and a large oscillation), 
approximately repeating itself. Filtering gives a new time series with one peak 
for each such pattern, facilitating the computation of Tp from the thp of the 
filtered time series giving t^, > Tp ~ 100. 

For A = 100 in Fig.^, the attractor is much more complicated {d ~ 7.1 
|33| ) and therefore it is difficult to observe patterns of oscillations that repeat 
themselves (but not as difficult as to make Poincare sections). However, in some 
particular parts of the time series, consecutive similar patterns may be observed 
showing implicit correspondence to orbital periods (see Fig.^). Hard filtering 
allows us even to assign a peak to each pattern giving t^, > Tp ~ 330. Note 
that filtering is performed only in order to discern the representative peaks, 
especially for higher dimensional systems. Noisy time series should be filtered 
anyway, before estimating Tp to avoid the fake peaks that are due to noise. 

Up to this point we have assumed that the measurement function is well 
defined according to Takens' generic assumptions, so that the oscillations in the 
observed time series do reflect the periodic-like orbits of the original system and 
vice versa. However, this is not always the case and as an example of a "good" 
and "bad" mapping let us consider the x and z variable of the Rossler system 
|34[ (see Fig.pl). In the time series of the x variable, the oscillations represent 
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200 400 600 800 1000 

time index k 

Figure 5: (a) A trajectory of the Rossler system in R'^. (b) Measurements 
of the z variable of the trajectory, (c) Measurements of the x variable of the 
trajectory. Note that the oscillations of the time series in (b) do not reveal all 
orbital periods associated with the trajectory while in (c) they do. 

the real orbits while in the time series of the z variable the orbital periods 
can hardly be recognized. In the latter case, an analysis will fail to identify the 
correct attributes of the system unless a very large amount of data is provided to 
compensate for the bad mapping. We found, for example, that for measurements 
over the same epoch, the correlation dimension of the Rossler attractor was well 
estimated by the x-measurements but significantly underestimated by the z- 
measurements due to the "knee" phenomenon we discuss below. 

We here suggest working directly in the time domain to estimate instead 
of considering periods corresponding to dominant frequencies as suggested by 
and Chaotic data will in general not show well defined frequency peaks. 
Other suggestions regarding t^, have been presented in the literature 1^, |pl| 
and Some attempted to estimate based on decorrclation criteria from 
the autocorrelation function and the mutual information p5| and |]3^ . In 
one paper treating this issue, 1^, lower and upper limits for where based on 
the autocorrelation function and it was proposed to set < r^, < 4tc, where Tc 
is the correlation time defined as the delay where the autocorrelation function 
is 1/e. This lower limit is much smaller than Tp for most systems. An upper 

limit for was given in by 2y^^^^^, where (x'-^-') and (a;'^') are the mean 
values of the time series and its first derivative, respectively. We found that for 
many systems this upper limit is also smaller than Tp. 

4 Correlation dimension and r^y 

We now discuss the use of r^, in the time series analysis. A natural procedure 
is to start with an initial and perform calculations - in this case computing 
the correlation dimension v ~ for a sufficiently large m. Then r^, is modified, 
the calculations repeated, and so on. To be able to conclude that a valid result 
has been obtained, reasonably stable values have to be found over a range of r^, 
values. 

First we define the correlation integral C(r), a statistic that measures the 
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fraction of points on the attractor being less than r units apart 



C{r) 



1 



N 

N(N -I) ^ 

i,j=l,\i-]\>K 



(2) 



where 9(x) is the Heaviside function, defined as Q{x) — 1 for a; > and 
Q{x) — for X < 0, and K is used to omit time-correlated points in the 
computation of C(r). The Euclidean norm is used because it gives more robust 
results in the presence of noise For deterministic systems, the correlation 
integral scales as C(r) ^ r'', where theoretically r — > 0. Preferably, v should be 
estimated from the slope of the graph of log C (r) against log r over a sufficient 
range [ri,r2] of small interdistances. However, due to noise or to limited data, 
an approximately constant slope may be maintained only for larger values of ri 
and r2 . We chose r2 /ri — 4 for the length of the interval, and searched over all 
such intervals to find the one where the computed v varied least^. The mean 
value of the slope in this interval is the estimated and it is always reported 
together with the standard deviation (shown with bars in the following figures). 

A key observation is that the estimate of the correlation dimension of a 
chaotic time series (clean or noisy) is approximately the same under variations 
of the parameters p and m while keeping Tw = {m — l)p fixed (assuming that 
m is always larger than the dimension of the attractor) . Only few workers seem 
to have thought along these lines (||], and ||l^). The typical features 

are demonstrated in Fig. ^ which shows the correlation dimension estimates for 
different Tw for clean and noisy data from the Lorenz system. Note how the 
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Figure 6: Plot of the correlation dimension estimate for MOD reconstruction 
with different Tu, for time series of the x variable of the Lorenz system. The 
bars denote the standard deviation of the estimate. In each figure the grey 
curve with grey error bars correspond to p = 2 while the black ones to p = 10. 
In (a) the estimation is based on the clean time series of 4000 data sampled 
with Ts = 0.02 and in (b) on the same data but corrupted with 5% noise. The 
horizontal stippled line shows the correct plateau for f = 2.06 and the shaded 
area the confidence interval of ±5% of the correct i/. 

grey and black curves match for the clean data in Fig.^. They correspond 

^To compute the slope for each r we use the best fit slope for three values, the current r, 
the previous and the next. 
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to the same but with p — 2 and p — 10, respectively. Once r^, and thus 
the p-dimensional hyperspace, has been determined, the particular projection 
chosen is not critical as long as the projection is sufficient, i.e. m > v and 
p ~ T = This is so, because the interdistances of points remain 

" m— 1 m— 1 ' ^ 

statistically the same in Rp and in R"*. Considering all the coordinates or only 
the selected subset has the same effect on the computation of the interdistance 
as long as a suitable norm is used, e.g. the Euclidean norm ||3^ . 

When white noise is added to the clean Lorenz data (Fig.^) the two curves 
still match but now show an increasing trend with Tu, . The estimation of v is 
more sensitive to the choice of Tw in the presence of noise. 

Results for the estimation of i' from noisy data or few data (compared to the 
minimum number of data required) should be interpreted with caution because 
they are derived from scaling properties based on large r. For smaller r, the 
scaling is corrupted by noise or distorted due to few neighbors in state space. In 
the case of attractors with different scaling properties for small and large r (a 
phenomenon referred to as a "knee" erronous estimates are obtained from 

the scaling for large r when noise or insufficient data length mask the correct 
scaling for small r. Such a phenomenon is observed for the z-measurements of 
the Rossler system mentioned before. The correct scaling (i/ ~ 2.01) can be only 
detected for very small inter-point distances r requiring a very large number of 
data, otherwise another scaling is detected for larger r, underestimating v. 

The estimation of v, even when it is constrained only to large r, is not 
straightforward as it varies with r^, and a typical situation is shown in Fig. ^ 
for Lorenz system. Too small t^, (r^, — 20) or too large {t^ = 160) gives 
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Figure 7: Plot of the slope of the graph log C(r) against log?- for the time series 
of 4000 data from the Lorenz system, sampled with Tg = 0.02 and with 5% 
additive noise. The three curves are derived from reconstructions with p = 10 
and TO = 3 (minimum embedding dimension), to = 9 and m = 17 and are 
identified by the length of r^, marked on the figure. The scaling interval of least 
variation is denoted with the black solid line segment for each slope curve. The 
grey area shows roughly the region where inter-point distances are corrupted 
by noise leaving a small interval of r to estimate v and making the choice of r^, 
critical. The horizontal stippled line shows the correct plateau for v ~ 2.06. 

uncertain and wrong estimates while for Tu, larger than but still close to Tp = 50 
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(here — 80)^ the scahng is clear indicating a reHable estimate. On the other 
hand, the range of suitable depends on the length of the time series; the 
longer the time series, the broader the limits for Tw Noise also restricts 
from above because the slope curves derived for increasing r^, do not saturate. 
Setting a criterion for the acceptance of the i/-estimate, e.g. ±5% of the correct 
value, an upper limit t„ for the range of may be found which varies with the 
amplitude of the noise (e.g. r„ ~ 110 for Fig. gj). It is thus expected that the 
scaling gets distorted as icreases over r„ giving less confident estimates as 
shown with the slope curve for = 160 in Fig.0. So, when the time series is 
corrupted with noise, the z/-estimates are more biased and the interval [Tp,T„] 
of the accepted r^, shrinks from above, and it may be no reliable estimate of v 
for any if the impact of noise is so large that t„ decreases to the level of r^. 

Thus when estimating v from a limited number of noisy data we seek the 
range of Tw that gives clear scaling for large r keeping in mind that the results 
are still ambiguous due to the possible different scaling for small and inaccessible 
r (the "knee" phenomenon). In the sequel, we consider in more detail simulated 
data corrupted with noise as well as real data. 

4.1 Noisy synthetic data 

Most of the time series we use here have length iV — 4000 adjusting the sam- 
pling time Ts accordingly in order to have enough oscillations as well as enough 
samples for each oscillation. It follows that the number of data points is not 
the best measure of the record length. We therefore also quote the number of 
Tp within the record, denoted #rp, together with the number of samples in Tp. 
Note that under changes of the reconstruction parameters or the noise ampli- 
tudes, the values r\ and ti of the scaling interval [ri,r2] that gives i^-estimates 
with least variance may change as well. 

Results for the time series from the a;-variable of the Lorenz system with 
Ta = 0.02 and Tp ~ 50 and #rp ~ 80 were shown in Fig.|[ For the clean 
data, legitimate estimates of v (within ±5% of the correct v = 2.06 shown 
as a shaded zone in the figure) were obtained for a large interval of Tu, values 
beginning even lower than Tp. As Tw is increased long beyond Tp the estimates 
increase somewhat and have larger variance. When 5% white Gaussian noise is 
added to these data, the correlation dimension is underestimated significantly 
for r^, < Tp, and for > Tn — 110, v is overestimated with larger variance. 

The attractor derived from the a;- variable of the Rossler system has a simpler 
structure than the Lorenz attractor and about the same dimension. However, 
estimates of v are more dependent on the reconstruction parameters and the 
amplitude of the noise. The time series is sampled with t^ = 0.1 that gives 
60 samples in each oscillation and about 66 oscillations, which are comparable 
to the Tp and #Tp for the Lorenz data. In Fig.^, the z^-estimates are plotted 
against Tw for the clean and noisy Rossler data displayed with grey and black 
error bars respectively, together with the ±5%-zone of the accepted range of v. 
Here, as well as in the following estimations, we keep p fixed (p = 20 in Fig. ||) 
and vary m. This is done for convenience since the results are essentially the 
same for other combinations of p and m (refer back to Fig.||). For the clean 

^ For this time series the periods of the oscillations vary a lot and thus the estimate Tp has 
large variance and does not completely indicate the "memory of the system" . 
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Figure 8: Plot of the correlation dimension estimate v for MOD reconstruction 
with different for time series of the x variable of the Rossler system. The grey 
curve with grey error bars correspond to the clean data and the black to the 
same data corrupted with 2% noise. Here, iV = 4000, = 0.1 and p — 20. The 
horizontal stippled line shows the correct plateau for v = 2.01 and the shaded 
area the confidence interval of ±5% of the correct v. 



data, reasonable and confident z^-estimates can be found for a small range of 
Tp = 60 < < 140 (the grey error bars in the ±5%-zone in the figure). When 
just 2% white noise is added to these data, the small horizontal plateau seems 
to disappear (the black line in Fig. ^ and only (/-estimates close to Tp and above 
can be accepted, which is in accordance with the proposed t^,. 

The Mackey Glass attractor for A = 17 has the same dimensionality as 
the two last attractors but gives less biased estimates of v. For Ts = 1, we 
found Tp = 50 and #rp ~ 80 from single oscillations. In Fig. H, results from 
the estimation of v are presented in the same way as for the Rossler data. For 
the clean data, a very reliable i^-estimate is derived over a large interval of t^^,, 
[20, 160] (from the ±5%-criterion). When 5% noise is added, confident estimates 
are obtained only close to Tp, and when 10% noise is added, reasonable estimates 
are only obtained for ~ Tp. 

When A = 30, the dimension of the attractor increases to ;/ 2± 3 p^ . 
However, using iV — 4000 and = 2 an underestimate [v ~ 2.5) was found. 
For this t^, the Tp estimated with the mean time for patterns of two oscillations 
(cf. section||) is kept down to Tp — 50 and #Tp ~ 80, as for A = 17. The results 
from estimation of v for clean and noisy data with 5% and 10% noise (shown 
in Fig. 10 a,) assert the use of Tp as a lower limit for t^, and the decrease of the 
interval of accepted values for t^, from above and towards Tp as the amplitude 
of the added noise is increased. The underestimation of v is due to the limited 
number of data. This attractor shows a "knee" structure, i.e. it has also another 
scaling (the correct v ~ 3.0) for small r which can be detected only when many 
data are accumulated as shown in Fig.[l0|b. The slope for too small t^, (t^ = 24) 
underestimates v while for t^ > Tp the correct scaling is achieved (shown with 
the two curves for = 48 and t^ = 168 in the figure). Note that these curves 
form a second scaling for larger r. 

For A = 100, the Mackey Glass attractor gets high dimensional with v ~7 
[p3|. Our results show a slightly lower v with as few as TV = 4000. We sampled 



13 



3 



1.5 





clean 
— 5% noise 
-- 10% noise 




















I 





















50 100 150 200 

time window length \ 



Figure 9: Plot of the correlation dimension estimate u for MOD reconstruction 
with different t^, for time series of the Mackey Glass equation for A = 17. The 
solid grey curve with solid grey error bars correspond to the clean data, the solid 
black to the noisy data with 5% noise and the stippled grey to the noisy data 
with 10% noise. Here, N = 4000, = 1 and p = 12. The horizontal stippled 
line shows the correct plateau iox v = 2 and the shaded area the confidence 
interval of ±5% of the correct v. 

the discretized system with = 10 in order to have enough, but not too many, 
samples within the estimated mean orbital period, ~ 33, giving as many as 
#rp ~ 120 repititions of the oscillation pattern that is assumed to correspond to 
an orbit of the underlying system. We deliberately keep the data record down to 
N = 4000 in order to test our procedure for short time series (compared to the 
high dimensionality of the system). The estimated v is an increasing function 
of Tw both in values and uncertainty, showing some stability in value and in 
variance for Tp ~ 30 < r^, < 45. This is, however, an underestimation of v, 
possibly due to insufficient data (see Fig.|l]). Adding 5% noise does not alter 
the i/-estimates but just increases moderately the uncertainty of the estimates; 
when 10% noise is added, the z^-estimates for r^, > Tp vary significantly from 
those of the clean data. 

These findings, as well as results for the Rabinovich-Fabrikant system pl[ , 
and the four-dimensional Rossler Hyperchaos system not shown here, con- 
firm our suggestion for estimating Tu, with Tp giving the best estimates of v. If 
the effect of noise or limited length of the time series is such that estimation 
of I' can be made only for a short range of values, this is close to and little 
larger than Tp. 

4.2 Real data 

In addition to simulated data, observations from physical controlled experiments 
on low dimensional deterministic processes should be used to assess the validity 
of non-linear methods. The noise level is often insignificant in such cases. Here 
we use a time series of N = 4000 samples from the Taylor Couette experiment in 
the chaotic regime. We estimated Tp ~ 75 and #Tp ~ 54, but the results for the 
estimation of v do not change for longer time records covering more oscillations 
(increasing either N or if we insist on keeping N small). Contrary to most of 
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Figure 10: (a) Plot of the correlation dimension estimate v for MOD recon- 
struction with different for time series of the Mackey Glass equation for 
A = 30. The solid grey curve with solid grey error bars correspond to the clean 
data, the solid black to the noisy data with 5% noise and the stippled grey to 
the noisy data with 10% noise. Here, iV = 4000, = 2 and p = 12. The 
horizontal stippled line shows the correct plateau for i/ ~ 3.0 and the shaded 
area the confidence interval of ±5% of the underestimated ~ 2.5. (b) Plot of 
the slope of the graph log C(r) against logr for the same type of data but for 
N = 30000. The three curves are derived from reconstructions with p = 12 and 
m = 3, TO = 6 and to = 17 and are identified by the length of r^, marked on 
the figure. The scaling interval of least variation is denoted with the black solid 
line segment for each slope curve. The two horizontal stippled lines show the 
two scalings of this attractor. 



the previous results from simulated data with noise, the estimated v varies little 
with Tyj as shown in Fig. For all t^, > Tp the etsimates are more or less fixed 
to ~ 2.6, approximately the value given in the literature [p^ , with a slowly 
increasing uncertainty for r^, > 150. This indicates that there is little noise in 
the data and the dimension of the chaotic attractor can be identified even with 
large (up to 2rp), so that the choice of is not critical. However, when 
we add noise to these data, to simulate a larger experimental uncertainty, the 
estimates have as expected a larger variance, but for close to Tp the estimates 
are the same as for the original time series. For larger there is a systematic 
overestimation of showing again that the optimal for correct estimation is 
close to Tp. 

We now turn to observational data that are not output of a controlled ex- 
periment, and concentrate on physiological data of the Electroencephalogram 
(figl3) from epileptic patients (e.g. see [^). Dimension estimation of physio- 
logical data has been a hot subject the last years. However, the results to date 
are not promising, partly because different procedures are often used giving dif- 
ferent i^-estimates for the same type of data, and partly because these data do 
not seem to share the same nice chaotic properties as the well-studied simulated 
data 1^ . Previous work on i^-estimation of EEG epileptic signals reported low 
dimensional attractors of varying dimension between 2 and 6, according to the 
physiological nature of the data, the data acquisition process, the computational 
scheme of estimation, as well as the parameter setting for reconstruction (p3], 
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Figure 11: Plot of the correlation dimension estimate v for MOD reconstruction 
with different for time series of the Mackey Glass equation for A = 100. The 
solid grey curve with solid grey error bars correspond to the clean data, the solid 
black to the noisy data with 5% noise and the stippled grey to the noisy data 
with 10% noise. Here, N = 4000, = 10 and p = 4. The horizontal stippled 
line shows the correct plateau for ~ 7.1 and the shaded area the confidence 
interval of ±5% of the correct v. 



H, H and H). 

Here, we use a short time series from an epileptic seizure of iV = 3400 
data sampled with Ts = O.OOSsec. The oscillations of the time series evolve 
irregularly, so the estimated tbp ~ 30 does not seem to be directly related to 
Tp. With a more thorough examination of the sequence of oscillations, we can 
distinguish patterns of oscillations that may correspond to orbital periods of 
the potential underlying attractor. In Fig.p^ we show a part of the time series 
where such patterns are apparent. After severe filtering, the time corresponding 
to each pattern can be estimated by the tbp for the filtered time series giving 
Tp ~ 110. Other parts of the time series are not so regular but still patterns 
of about the same time length can be identified qualitatively. The standard 
estimation procedure applied to these data gave no clear saturation of the i'- 
estimate for increasing r^,, (grey curve in Fig.|l^). The estimate increases 
with increasing variance showing some flatness for a small region of values of 

around 100. In fact, for > 100 there is scaling but over a shorter interval 
of interdistances [rl,r2] not satisfying the more stringent criterion r2/rl = 4. 
Relaxing this to r2/rl — 2, which has previously been used for EEG signals pGj , 
a clear saturation with ~ 4 is established for Tu, > 100, though with increasing 
variance (Fig.p^). Thus, the optimal choice of for the computation of 
should be around 100, which is close to Tp = 110, the estimate of r^, from the 
oscillations of the time series. Note that these results are not general for epileptic 
EEG signals. Other EEG data showed very poor scaling and no saturation for 
increasing even for r2/r'l = 2 |^ giving no valid estimate for i/. In these 
cases, no patterns of oscillations could be observed. 
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Figure 12: Plot of the correlation dimension estimate v for MOD reconstruction 
with different for time series from the Taylor Couette experiment in the 
chaotic regime. The solid grey curve with solid grey error bars correspond to 
the original data, the solid black to the original data corrupted with 5% noise 
and the stippled grey to the original data corrupted with 10% noise. Here, 
p = 20 is chosen for reconstructions varying with m. The horizontal stippled 
line shows the correct plateau for v ~ 2.6 and the shaded area the confidence 
interval of ±5% of the correct v. 

5 Conclusions 

Our analysis in section 2 showed that when one reconstructs with MOD, effective 
techniques for determining the delay time r and the lowest embedding dimension 
m are lacking. Concerning r, there is no standard indication of which value is 
the most appropriate. In fact, if we allow m to be very large, we can even use a 
very small r in the reconstruction. It seems that instead of relying on estimates 
for T (such as the zero of the autocorrelation function or the minimum of mutual 
information) and m (such as the estimate from false nearest neighbors) one could 
rather employ "trial and error". In fact, this seems to be common in practice. 

A more systematic and less tedious way to make reconstructions has been 
proposed here focusing on the time window length Tw We argued that Tw 
is the first parameter to be determined when reconstructing the state space 
and suggested that it should be approximated by the mean orbital period Tp. 
For low dimensional attractors, Tp is set to the time between peaks thp, easily 
calculated by averaging the time between successive maxima of the time series. 
Noisy time series may be filtered before determining thp. For higher dimensional 
and more complicated systems, the mean orbital period may be found from 
coherent patterns of oscillations. Computationally, this can be done measuring 
the "period" of such oscillating patterns, or applying strict filtering so that each 
pattern becomes one oscillation, and then compute the thp. 

With the estimation of and a sufficiently large m, the reconstruction 
is completely defined and can be used for further analysis of the time series. 
Regarding the correlation dimension, an initial estimate may be derived with 
Tw — Tp, and then checking whether the same estimate is obtained when is 
increased. For noisy data, the estimate remains the same only for close to Tp, 
as noise sets an upper limit to r^,. The proposed parameter setting turned out 
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Figure 13: (a) Segment of the EEG time series of an epileptic seizure sampled 
with Tg — O.OOSsec (solid grey line) and after smoothing with a 40 point FIR 
filter (stippled black line), (b) Plot of the correlation dimension estimate ly for 
MOD reconstruction with different r^, for EEG time series in epileptic seizure. 
The grey curve with grey error bars correspond to estimation over a scaling 
interval [ri,r2] with r2/ri = 4 while the black curve with black error bars 
correspond to r2/ri = 2. The other parameters are N = 3400, = 0.005 and 
p= 10. 

to give the most confident i^-estimates for all data analyzed where estimation 
was possible. 
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